Plotting with the matplotlib library

In this lecture we will create graphs with the matplotlib library [ wiki ]

Before we begin, we should mention that "plotting" traditionally belongs to a course on data analysis rather than one on programming. However, programming in biology, and in science in general, usually involves data analysis.

Plotting is art!

A good plot does not depend on being a good programmer or on having high-quality data (although both help). Good plotting has to do with your aesthetics, your perception of colors and your sense of proportion.

That is why before (or even after ...) this lecture you should take a look at the following:

You should also take a look at how NOT to make plots:

The usual way to import matplotlib is:

Then, for each new plot, we must create the following two objects, fig and ax:
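A minimal sketch of these two steps, assuming the conventional alias plt and a call to plt.subplots() with no arguments (which gives one figure holding one set of axes):

```python
import matplotlib.pyplot as plt

# One Figure (the whole canvas) and one Axes (the area we draw in).
fig, ax = plt.subplots()

# The axes object belongs to the figure.
print(ax in fig.axes)
```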

In general: matplotlib has three basic items for handling plots:

Let's make an empty plot!

To display a plot we use the show() method:

ax.plot accepts a wide variety of arguments. The first two arguments are lists: the first contains the X-axis coordinates of the points we want to plot, and the second the Y-axis coordinates. For example, to display a zigzag line passing through the points (1,3), (2,5), (3,0):
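The zigzag line described above could be drawn as follows (a sketch; the variable names x, y and lines are only illustrative):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

x = [1, 2, 3]  # X coordinates of the three points
y = [3, 5, 0]  # Y coordinates of the three points

# ax.plot returns a list with one line object per plotted line.
lines = ax.plot(x, y)
```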

plot also accepts a third argument which is the "style" of the line. It consists of two parts: color and style. The color can be ( http://matplotlib.org/api/colors_api.html ):

In case we want to plot a line, the style can be one of ( http://matplotlib.org/api/lines_api.html#matplotlib.lines.Line2D.set_linestyle ):

In case we want to plot only the points (and not lines) then the options are ( http://matplotlib.org/api/markers_api.html ):

For example a red dotted line:

Markers as blue pentagons:
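Both examples can be sketched with a short format string as the third argument: "r:" combines the color code r (red) with the line style : (dotted), and "bp" combines b (blue) with the marker p (pentagon). The data points are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Red dotted line: color "r" + line style ":"
(dotted,) = ax.plot([1, 2, 3], [3, 5, 0], "r:")

# Blue pentagon markers, no connecting line: color "b" + marker "p"
(pentagons,) = ax.plot([1, 2, 3], [0, 4, 2], "bp")
```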

Of course there is a better way to define a color, with the parameter c:

Yes, there is a color called "peru". A full list of names can be found here: ( http://matplotlib.org/examples/color/named_colors.html ). Alternatively, you can pass a value from 0.0 to 1.0, written as a string (e.g. '0.5'), to get a shade of gray, where 0.0 is black and 1.0 is white:

Of course you can use any RGB color. There are many sites where you can choose a color, e.g. http://htmlcolorcodes.com/
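The ways of specifying a color mentioned above can be sketched side by side. The particular hex code and RGB tuple are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [1, 2, 3]

(named,) = ax.plot(x, [3, 5, 0], c="peru")              # named color
(gray,) = ax.plot(x, [2, 4, 1], c="0.5")                # grayscale, given as a string
(rgb_hex,) = ax.plot(x, [1, 3, 2], c="#1f77b4")         # RGB as a hex code
(rgb_tuple,) = ax.plot(x, [0, 2, 3], c=(0.2, 0.6, 0.3)) # RGB as values in [0, 1]
```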

plot can be called many times on the same axes:

We can also plot a function f by computing x, and y=f(x):
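As a small sketch, here is a hypothetical function f(x) = x^2 evaluated at integer points with plain Python lists:

```python
import matplotlib.pyplot as plt


def f(x):
    """Example function to plot (chosen only for illustration)."""
    return x ** 2


x = list(range(-10, 11))  # -10, -9, ..., 10
y = [f(v) for v in x]     # y = f(x) for every x

fig, ax = plt.subplots()
ax.plot(x, y)
```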

A better way to plot functions is to use numpy's linspace. linspace(a, b, c) creates an arithmetic progression from a to b (both included), with a total of c elements.
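A sketch of this approach, assuming we want to plot sin(x) over one full period with 100 points:

```python
import numpy as np
import matplotlib.pyplot as plt

# 100 evenly spaced values from 0 to 2*pi, endpoints included.
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)  # numpy applies sin to every element at once

fig, ax = plt.subplots()
ax.plot(x, y)
```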

The value 100 in linspace can also be seen as the "graph resolution":
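To see the effect of the resolution, we can draw the same curve with very few points. The choice of 8 points here is arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

x_coarse = np.linspace(0, 2 * np.pi, 8)  # only 8 points

fig, ax = plt.subplots()
# matplotlib joins consecutive points with straight segments,
# so a low point count gives a visibly angular curve.
ax.plot(x_coarse, np.sin(x_coarse))
```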

Notice how "broken" the graph looks

With the methods ax.set_xlim and ax.set_ylim we can change the limits of the axes:
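For instance (a sketch with made-up limits), we can zoom in on part of a parabola:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, x ** 2)

ax.set_xlim(2, 8)   # show only 2 <= x <= 8
ax.set_ylim(0, 70)  # show only 0 <= y <= 70
```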

We can also put labels on the axes and over the plot:

You can specify the size, font and style of the labels:
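A sketch using ax.set_xlabel, ax.set_ylabel and ax.set_title with a few text properties. The label texts and property values are only examples.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Text properties such as size, family, style and weight
# are passed as keyword arguments.
ax.set_xlabel("time (s)", fontsize=14, fontfamily="serif")
ax.set_ylabel("signal", fontsize=14, fontstyle="italic")
ax.set_title("A labelled plot", fontsize=18, fontweight="bold")
```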

With ax.set_xticks you can specify which ticks will appear on each axis:

You can also change the tick label:
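Both steps can be sketched together: first fix the tick positions with ax.set_xticks, then give each position its own text with ax.set_xticklabels. The positions and texts are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

ax.set_xticks([0, 5, 10])                       # where the ticks appear
ax.set_xticklabels(["start", "middle", "end"])  # one label per tick, same order
```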

The plot function returns a list of line objects, one for each plotted line. These act as handles for the legend: we can pass them, together with a label for each, to plt.legend:
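A sketch with two curves. The names line_sin and line_cos and the choice loc="upper right" are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()

# Each call returns a one-element list; the trailing comma unpacks it.
(line_sin,) = ax.plot(x, np.sin(x))
(line_cos,) = ax.plot(x, np.cos(x))

# Handles first, then the matching labels.
legend = plt.legend([line_sin, line_cos], ["sin(x)", "cos(x)"], loc="upper right")
```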

For more about loc (the location of the legend), see here: http://matplotlib.org/api/legend_api.html#matplotlib.legend.Legend

We can also add a grid:

We can also add text at a point (X, Y):
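A sketch combining the grid and a piece of text, using ax.grid and ax.text. The coordinates are in data units, and the text itself is made up.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [3, 5, 0])

ax.grid(True)                       # draw grid lines at the ticks
label = ax.text(1.5, 4.5, "rising")  # text anchored at x=1.5, y=4.5
```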

There is also the annotate function with which you can draw arrows:
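A minimal sketch of ax.annotate: xy is the point the arrow points to, xytext is where the text is placed, and arrowprops describes the arrow. The values are chosen for the zigzag line used earlier.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [3, 5, 0])

arrow_note = ax.annotate(
    "peak",
    xy=(2, 5),                        # arrow tip (data coordinates)
    xytext=(2.5, 4),                  # position of the text
    arrowprops=dict(arrowstyle="->"),  # a simple arrow
)
```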

Often, we want to draw two plots that share the same axis. Suppose, for example, that we want to share the X axis:
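One way to do this (an assumption about what is meant here) is ax.twinx(), which creates a second set of axes on top of the first, with the same X axis but an independent Y axis:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), c="blue")
ax.set_ylabel("sin(x)")

# Second axes: same X axis, its own Y axis drawn on the right.
ax2 = ax.twinx()
ax2.plot(x, np.exp(x), c="red")
ax2.set_ylabel("exp(x)")
```

If the two plots should instead sit in separate panels, plt.subplots(2, 1, sharex=True) is an alternative.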

We can also embed a whole new plot inside another with the method fig.add_axes(). add_axes IGNORES the data range of the axes (e.g. X ranging from 10 to 100 above). Instead, it treats the ENTIRE figure as the unit square [0,1] x [0,1]. With add_axes we define the position and dimensions of the new plot on the old one, as [left, bottom, width, height] in these fractional coordinates.

For example:

The semantics of the add_axes parameters are shown in the following figure:

Let's plot something in the sub-plot:
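A sketch of the whole procedure: a main plot, a small embedded plot created with fig.add_axes, and something drawn inside it. The position [0.2, 0.55, 0.3, 0.25] and the plotted functions are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(10, 100, 100)

fig, ax = plt.subplots()
ax.plot(x, x ** 2)

# [left, bottom, width, height] as fractions of the figure:
# start 20% from the left and 55% from the bottom,
# and take up 30% of the width and 25% of the height.
inset = fig.add_axes([0.2, 0.55, 0.3, 0.25])

# The embedded axes are used like any other axes object.
inset.plot(x, np.sqrt(x), c="red")
```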

We can also change the scale of the axes to be logarithmic:
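For example (a sketch), an exponential curve becomes a straight line once the Y axis is logarithmic. ax.set_xscale works the same way for the X axis.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(1, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, 10 ** x)

ax.set_yscale("log")  # logarithmic Y axis; X stays linear
```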

We can save a plot via plt.savefig. The file format is chosen from the extension of the file name. CAUTION! You must call savefig BEFORE plt.show, because after show the figure may already have been closed, in which case savefig writes an empty image.
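A sketch that saves the same plot in two formats. Writing into a temporary directory and the file names zigzag.png and zigzag.pdf are choices made only for this illustration.

```python
import os
import tempfile
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [3, 5, 0])

out_dir = tempfile.mkdtemp()  # throwaway directory for the example
png_path = os.path.join(out_dir, "zigzag.png")
pdf_path = os.path.join(out_dir, "zigzag.pdf")

# The extension decides the format: PNG is a raster image, PDF is vector.
plt.savefig(png_path)
plt.savefig(pdf_path)

# Only now would we call plt.show().
```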

Some comments on file formats:

The complete example:

subplots

We can have multiple plots aligned on a grid. This is possible by passing the number of rows and columns to plt.subplots:

For example: 3 vertical subplots:

3 horizontal subplots:

A 2X3 grid of plots:

We observe that ax is a 2X3 array:

So we can refer to any element of this array in order to "fill" some of these sub-plots:
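The cases above can be sketched as follows. With a single row or column, ax is a one-dimensional array; with a 2x3 grid it is two-dimensional and indexed as ax[row, column]. The plotted data are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig_v, ax_v = plt.subplots(3, 1)  # 3 vertical subplots: 3 rows, 1 column
fig_h, ax_h = plt.subplots(1, 3)  # 3 horizontal subplots: 1 row, 3 columns

fig, ax = plt.subplots(2, 3)      # 2x3 grid
print(ax.shape)

# Fill only some of the sub-plots.
ax[0, 1].plot(x, np.sin(x))  # first row, second column
ax[1, 2].plot(x, np.cos(x))  # second row, third column
```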

Some notes:

Dendrogram

Another type of plot that is particularly useful in phylogenetics is the dendrogram. For example, suppose we have 10 objects and we compute all pairwise distances:
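A minimal sketch, assuming the scipy library is available. The 10 objects are random 2D points invented for the example, and average linkage is just one possible clustering method.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
points = rng.random((10, 2))  # 10 hypothetical objects

# All pairwise distances as a condensed vector: 10*9/2 = 45 values.
distances = pdist(points)

# Hierarchical clustering on the distances, then the tree plot.
tree = linkage(distances, method="average")

fig, ax = plt.subplots()
result = dendrogram(tree, ax=ax)
```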

plotly

Matplotlib belongs to the category of non-interactive plotting libraries. This is suitable for presentations and papers, but not for visual exploration and inspection!

There are very good open libraries for data inspection, such as bokeh and plot.ly.

plotly needs installation:

!pip install plotly

Let's look at some examples with plotly:

A typical plot:

This plot is interactive. We can save it as html:

Scatter plot with plotly:

An example with "circular plots": the chloroplast DNA of Arabidopsis thaliana. Data are available here: ftp://ftp.ensemblgenomes.org/pub/plants/release-49/gff3/arabidopsis_thaliana (file: Arabidopsis_thaliana.TAIR10.49.chromosome.Pt.gff3.gz):

Seaborn

Seaborn is a library for "statistical graphics". That is, it provides methods for mapping distributions, histograms, clustering, etc.

Here is an example: https://seaborn.pydata.org/generated/seaborn.displot.html

Seaborn also allows the "beautification" of matplotlib's plots. Let's look at an example:

Without seaborn:

With seaborn:

Read here how we can choose different "styles" for our plots with seaborn: http://seaborn.pydata.org/tutorial/aesthetics.html

Networkx

Networkx is a library for editing and visualizing graphs. After installing it (pip install ...) we can make graphs as follows:

GeoPandas

GeoPandas is a pandas extension for visualization of geographic information.

Here is an example:

plotly GEO interactive

We can also make interactive plots with geographical information through plotly:

For the following example we need the file: https://www.dropbox.com/s/brqaopz8g2vs0ox/covid_fasta.gz?dl=1